Multi-host management server in storage system, program for the same and path information management method

ABSTRACT

Management arrangements to: (A) receive plural failure information from plural host computers for a predetermined period; (B) store the failure information; (C) extract one or more of the plural failure information, received from a first host computer among the plural host computers; (D) retrieve the failure information about one path from the extracted failure information, about multiple paths; (E) register the first host computer via refresh information in the memory, refresh information indicating a host computer of which path information is to be updated; (F) send a request to the first host computer to acquire a status of a first path of the first host computer; (G) update a first path information in the plurality of path information of the first host computer, based on the status; and (H) delete the one or more of the plurality of failure information extracted in (C), from the failure reception information.

CROSS REFERENCE TO RELATED APPLICATION

This is a continuation of U.S. application Ser. No. 11/969,327, filed Jan. 4, 2008. This application relates to and claims priority from Japanese Patent Application No. 2007-169751, filed on Jun. 27, 2007. The entirety of the contents and subject matter of all of the above is incorporated herein by reference.

BACKGROUND

1. Field of the Invention

The invention relates generally to a storage system. More specifically, the present invention relates to a multi-host management server for efficiently updating path information in a storage system in which a plurality of paths is set between a plurality of hosts and a storage apparatus(es).

2. Description of Related Art

In companies or similar environments, a storage system is constructed by connecting a plurality of storage apparatuses to a host(s) using paths via a SAN (Storage Area Network) in order to store and manage a large amount of data. Typically, not a single path but multiple paths are set between the host(s) and the storage apparatuses, which is generally referred to as “multi-path.”

Each host operates software for managing the multi-path to realize functions such as path configuration detection, path failure detection and path switching (hereinafter referred to as multi-path management software).

For a storage system having a management computer connected to hosts and storage apparatuses via a SAN, a technique for managing the multi-path has been disclosed in which, upon receiving a report about a path failure from a host system or a storage apparatus, the management computer commands the relevant host or storage apparatus to set a new path definition (see JP2007-72571 A).

In a large-scale system environment, a multi-host management server is provided for collectively managing and monitoring multiple paths set to each host in order to check if each host is operating normally. The multi-host management server stores path information for each host. In order to keep the path information stored in the multi-host management server up to date, a user manually operates the multi-host management server to issue a query to each host and update the relevant path information. This processing, in which the multi-host management server updates the path information it stores based on path information obtained from each host, is generally called “host refresh.”

In the above arrangement, each host monitors the statuses of paths set to the host itself. When a host detects a path failure, this host reports failure information about the failure to the multi-host management server. However, a user cannot obtain up-to-date path information unless the user manually executes a host refresh. More specifically, although the user can recognize the path failure by receiving the failure information, the user cannot recognize which path the failure has occurred on without acquiring the up-to-date path information.

As described above, in the related art, the user has to manually execute a host refresh after the multi-host management server receives failure information from the relevant host, which is troublesome for the user. Accordingly, the multi-host management server is preferably set up so that it can automatically perform the host refresh.

However, if the multi-host management server executes a host refresh every time it receives failure information and as many times as the number of pieces of received failure information, unnecessary transfers of path information will occur. In particular, there is a kind of failure in which the path status is instantaneously switched between normal status and failure status (hereinafter referred to as instantaneous path interruption), and this type of failure can occur several times in a short time period, so executing a host refresh every time a failure occurs will result in a large load on the network.

If the failure information contains many pieces of path information, the multi-host management server can update the path information stored in the multi-host management server only by referring to the failure information. However, an increase in the amount of failure information data will increase the load on the network, so the amount of failure information data is not something that can be indiscriminately increased.

In light of these circumstances, there has been demand for a storage system capable of performing the host refresh automatically and without resulting in a large load on a network.

SUMMARY

In the light of the above problems, it is an object of the present invention to provide a multi-host management server capable of efficiently and automatically performing host refresh for updating path information by keeping failure information received from an arbitrary host for a predetermined time period and retrieving one piece of failure information from those received from the same host during the predetermined time period.

In order to solve the problem above, provided according to an aspect of this invention is a multi-host management server that manages a plurality of hosts with a plurality of paths set between the hosts and a storage apparatus and stores path information for each host, the multi-host management server including: a reception section that receives failure information about the paths from the hosts and stores the received failure information in a failure information reception queue; an extraction section that extracts plural pieces of failure information about a plurality of paths received from a common host from the failure information reception queue; a retrieval section that retrieves failure information about one path from the extracted plural pieces of failure information about the plurality of paths; a registration section that registers, based on the retrieved failure information about the one path, information indicating a host relevant to this failure information in a host refresh queue for updating path information; a deletion section that deletes, after the registration of the host-indicating information in the host refresh queue, the plural pieces of failure information about the plurality of paths received from the common host from the failure information reception queue; and an execution section that executes, based on the host-indicating information, update of path information for the relevant host.

With the above arrangement, the host refresh does not have to be executed every time or soon after the multi-host management server receives the failure information. If, after receiving the failure information report from a certain host, new failure information is received from the same host, the multi-host management server retrieves failure information for one path and executes a host refresh. Accordingly, the number of times host refresh is executed can be reduced. In addition, if failures occur several times in a short time period, like in the case of instantaneous path interruption, the number of times host refresh is executed can be greatly reduced.

Provided according to another aspect of this invention is a multi-host management server that manages a plurality of hosts with a plurality of paths set between the hosts and a storage apparatus and stores path information for each host, the multi-host management server including: a reception section that receives failure information about the paths from the hosts and stores the received failure information in a failure information reception queue; a screen display section that refers to a host information table storing information indicating whether or not path information is up-to-date for each host and displays, after failure information about a path is received from a certain host, information indicating whether or not path information for the relevant host is up-to-date, on a screen of the multi-host management server; an extraction section that extracts from the failure information reception queue plural pieces of failure information about a plurality of paths received from a common host after a predetermined time period from reception times of the plural pieces of failure information in the failure information reception queue; a retrieval section that retrieves failure information about one path from the extracted plural pieces of failure information about the plurality of paths; a registration section that registers, based on the retrieved failure information about the one path, information indicating a host relevant to this failure information in a host refresh queue for updating path information; a deletion section that deletes, after the registration of the host-indicating information in the host refresh queue, the plural pieces of failure information about the plurality of paths received from the common host from the failure information reception queue; and an execution section that executes, based on the host-indicating information, update of path information for the relevant host.

With the above arrangement, the host refresh does not have to be executed every time or soon after the multi-host management server receives the failure information. If, after receiving the failure information report from a certain host, new failure information is received from the same host, the multi-host management server retrieves failure information for one path and executes a host refresh. Accordingly, the number of times host refresh is executed can be reduced. In addition, if failures occur several times in a short time period, like in the case of instantaneous path interruption, the number of times host refresh is executed can be greatly reduced. Furthermore, a user can recognize whether or not information for the paths set to an arbitrary host is up-to-date.

Provided according to another aspect of this invention is a multi-host management program executed by a multi-host management server that manages a plurality of hosts with a plurality of paths set between the hosts and a storage apparatus and stores path information for each host, the program operating the multi-host management server to perform the following processing: receiving failure information about paths from the hosts and storing the received failure information in a failure information reception queue; extracting from the failure information reception queue plural pieces of failure information about a plurality of paths received from a common host; retrieving failure information about one path from the extracted plural pieces of failure information about the plurality of paths; registering, based on the retrieved failure information about the one path, information indicating a host relevant to this failure information in a host refresh queue for updating path information; deleting, after the registration of the host-indicating information in the host refresh queue, the plural pieces of failure information about the plurality of paths received from the common host from the failure information reception queue; and updating, based on the host-indicating information, path information for the relevant host.

With the above arrangement, the host refresh does not have to be executed every time or soon after the multi-host management server receives the failure information. If, after receiving the failure information report from a certain host, new failure information is received from the same host, the multi-host management server retrieves failure information for one path and executes a host refresh. Accordingly, the number of times host refresh is executed can be reduced. In addition, if failures occur several times in a short time period, like in the case of instantaneous path interruption, the number of times host refresh is executed can be greatly reduced.

Provided according to another aspect of this invention is a path information management method performed by a multi-host management server that manages a plurality of hosts with a plurality of paths set between the hosts and a storage apparatus and stores path information for each host, the method including the steps of: a reception step of receiving failure information about paths from the hosts and storing the received failure information in a failure information reception queue; a screen display step of referring to a host information table storing information indicating whether or not path information is up-to-date for each host and displaying, after failure information about a path is received from a certain host, information indicating whether or not path information for the relevant host is up-to-date on a screen of the multi-host management server; an extraction step of extracting from the failure information reception queue plural pieces of failure information about a plurality of paths received from a common host, after a predetermined time period from reception times of the plural pieces of failure information in the failure information reception queue; a retrieval step of retrieving failure information about one path from the extracted plural pieces of failure information about the plurality of paths; a registration step of registering, based on the retrieved failure information about the one path, information indicating a host relevant to this failure information in a host refresh queue for updating path information; a deletion step of deleting, after the registration of the host-indicating information in the host refresh queue, the plural pieces of failure information about the plurality of paths received from the common host from the failure information reception queue; and an execution step of executing, based on the host-indicating information, update of path information for the relevant host.

With the above arrangement, the host refresh does not have to be executed every time or soon after the multi-host management server receives the failure information. If, after receiving the failure information report from a certain host, new failure information is received from the same host, the multi-host management server retrieves failure information for one path and executes a host refresh. Accordingly, the number of times host refresh is executed can be reduced. In addition, if failures occur several times in a short time period, like in the case of instantaneous path interruption, the number of times host refresh is executed can be greatly reduced.

EFFECT OF INVENTION

The multi-host management server of this invention is arranged to temporarily store failure information in a queue and execute a host refresh after a predetermined time period. With this arrangement, since the multi-host management server retrieves only the latest failure information from plural pieces of failure information for a common host and deletes the remaining old failure information, the number of times host refresh is indiscriminately executed can be reduced. In particular, when a plurality of failures occurs in a short time period, like in the case of instantaneous path interruption, the load on a network can be greatly reduced.

In addition, the multi-host management server in this invention is arranged so that, when the host refresh has not been executed even after the reception of the failure information, a user can recognize that situation. This arrangement can avoid problems in which a user performs an operation using a path having a failure (e.g., disconnection) without being aware of the failure and causes trouble to occur in the host's processing.

Other aspects and advantages of the invention will be apparent from the following description and the appended claims.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is an overall conceptual diagram showing a storage system according to an embodiment of this invention.

FIG. 2 is an overall block diagram showing the storage system according to the above embodiment.

FIG. 3 is a diagram showing a host information table managed by a multi-host management server according to the above embodiment.

FIG. 4 is a diagram showing a path information table managed by the multi-host management server according to the above embodiment.

FIG. 5 is a conceptual diagram showing how the multi-host management server requests path information according to the above embodiment.

FIG. 6 is a diagram showing a failure information table managed by the multi-host management server according to the above embodiment.

FIG. 7 is a conceptual diagram showing how the multi-host management server receives the failure information according to the above embodiment.

FIG. 8 is a diagram showing a path information table managed by a host according to the above embodiment.

FIG. 9 is a block diagram showing the function of multi-host management software according to the above embodiment.

FIG. 10 is an illustration showing a management screen for icons indicating hosts according to the above embodiment.

FIG. 11 is an illustration showing a management screen for the path information table according to the above embodiment.

FIG. 12 is an illustration showing a management screen for the failure information table according to the above embodiment.

FIG. 13 is a flowchart showing failure reception processing according to the above embodiment.

FIG. 14 is another flowchart showing the failure reception processing according to the above embodiment.

FIG. 15 is a conceptual diagram showing the failure reception processing according to the above embodiment.

FIG. 16 is a flowchart showing path information update processing according to the above embodiment.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

(1) Configuration of Storage System

FIG. 1 is a conceptual diagram showing the entire system of a storage system according to an embodiment of this invention.

The reference numeral 1 denotes the storage system of this invention. The storage system 1 has a configuration in which a multi-host management server 2 is connected to a plurality of hosts 3 via an IP network 8 and the plurality of hosts 3 is connected to a plurality of storage apparatuses 4 using paths P via a SAN 7.

In the SAN 7, HBA (Host Bus Adapter) ports 12 on the host 3 side and CHA (Channel Adapter) ports 20 on the storage apparatus 4 side are connected to each other by fibre cables.

In this specification, the “path P” may be a logical path or a physical path.

FIG. 2 is an overall block diagram showing the storage system 1 of this invention.

The multi-host management server 2 is a server for managing the entire system, and includes a processor 20, a memory 21 and a memory device 23. In addition, the multi-host management server 2 includes: a management screen S1 that displays a GUI (Graphical User Interface) for configuring various settings or issuing an execution command to each host 3; and an input apparatus (not shown) such as a keyboard and a mouse used for various types of operation, input and setting.

The memory 21 stores various control programs that are read out from the memory device 23 when the multi-host management server 2 is started. The multi-host management server 2 executes various types of processing by operating the processor 20 to execute these control programs. Multi-host management software 22 is one of the control programs and is stored in the memory 21.

The multi-host management software 22 is software for monitoring paths set to each host 3. When the multi-host management server 2 detects a failure in a certain path P, the multi-host management server 2 receives from the relevant host 3 at least part of some path information, i.e., a host name and a path identifier, as failure information.

The memory device 23, which may be a hard disk or the like, stores various control programs and various control parameters. The memory device 23 also stores a host information table 24, an integrated path information table 25 and a failure information table 26 (each will be described later). As the management screen S1, for example, a CRT (Cathode-Ray Tube) or a liquid crystal monitor may be used.

Each host 3 includes information processing resources such as a processor 30, a memory 31, a plurality of host bus adapters 32 and a memory device 36. The memory 31 stores operation software 33, which is operation application software. The host 3 performs predetermined operation processing by operating the processor 30 to execute the operation software 33. The memory 31 also stores multi-path management software 34 and a path information table 35, which will be described later.

The host bus adapters 32 are interfaces the host 3 uses to access the storage apparatuses 4 via the SAN 7, and an FC (Fibre Channel) card or the like may be employed for each host bus adapter 32.

The storage apparatuses 4 include a plurality of channel adapters 40, a controller 41 and a plurality of memory devices (not shown).

The channel adapters 40 are interfaces the storage apparatuses 4 use to communicate with the hosts 3 via the SAN 7, and an FC card or the like may be employed for each channel adapter 40. The channel adapters 40 are each provided with one or more ports, each port being assigned a network address such as a WWN (World Wide Name) and an IP (Internet Protocol) address for identifying the port on the SAN 7.

The memory devices (not shown) installed in the storage apparatuses 4 may be expensive disk drives such as FC disks, or inexpensive disk drives such as SATA (Serial AT Attachment) disks and optical disc drives. One or more logical volumes LU are defined in storage areas provided by one or more memory devices. The logical volume LU is an external volume for externally storing data in the host 3 and consists of blocks of predetermined size, and the host 3 reads/writes data to/from each of the blocks. These memory devices are used in combination with memory devices (internal volumes) 36 for internally storing data in the hosts 3.

Each logical volume LU is assigned a unique identifier (LUN: Logical Unit Number). In this embodiment, data is input or output by designating an address, which is a combination of this identifier and a unique number (LBA: Logical Block Address) assigned to each of the blocks.
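As a minimal sketch of this addressing scheme, the Python fragment below models an address as a (LUN, LBA) pair. The names `BlockAddress` and `byte_offset` and the 512-byte block size are assumptions introduced for illustration; the specification only states that the blocks have a predetermined size.

```python
from typing import NamedTuple

# Assumed block size in bytes; the text only says "predetermined size".
BLOCK_SIZE = 512


class BlockAddress(NamedTuple):
    """Address of one block: logical unit number plus logical block address."""
    lun: int  # identifier of the logical volume LU
    lba: int  # number of the block inside that logical volume


def byte_offset(address: BlockAddress, block_size: int = BLOCK_SIZE) -> int:
    """Return the byte offset of the block inside its logical volume."""
    return address.lba * block_size


address = BlockAddress(lun=3, lba=8)
offset = byte_offset(address)
```

With the assumed block size, block 8 of logical volume 3 starts 4096 bytes into that volume.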

The controller 41 includes information processing resources such as a CPU (Central Processing Unit) and a memory, and controls data input or data output to/from the logical volumes LU in response to a request from the hosts 3.

This invention is characterized in that the multi-host management server 2 temporarily stores failure information from the hosts 3 in a queue in the multi-host management software 22, retrieves the latest failure information from pieces of failure information received from a common host after a predetermined time period, extracts host name information from the retrieved failure information, and executes host refresh on the relevant host 3.

First, the host information table 24, the integrated path information table 25 and the failure information table 26 managed by the multi-host management server 2, which realize the above characteristic, will be described in detail.

(2) Host Information Table

As shown in FIG. 3, the host information table 24 is a table for managing information about each host 3 and includes “HOST” fields 24A, “IP address” fields 24B, “version” fields 24C and “flag” fields 24D.

The “HOST” field 24A, the “IP address” field 24B and the “version” field 24C respectively store the host name, the IP address and the software version of the multi-path management software 34 for each host 3. The “flag” field 24D stores information about whether the path information managed by each host 3 is up-to-date in the multi-host management server 2. In the “flag” field 24D of this embodiment, a flag “0” is set when the path information is up-to-date, while a flag “1” is set when the path information is not up-to-date.
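One row of the host information table 24 could be modeled as in the following sketch. The class name `HostInfo`, the helper `mark_not_updated`, the attribute names and the sample values are hypothetical; only the four fields and the meaning of flags “0” and “1” come from the description above.

```python
from dataclasses import dataclass

UP_TO_DATE = 0   # flag "0": path information is up-to-date
NOT_UPDATED = 1  # flag "1": path information is not up-to-date


@dataclass
class HostInfo:
    """One row of the host information table 24 (attribute names are illustrative)."""
    host: str               # "HOST" field 24A
    ip_address: str         # "IP address" field 24B
    version: str            # "version" field 24C
    flag: int = UP_TO_DATE  # "flag" field 24D


# Hypothetical sample rows; names, addresses and versions are invented.
host_table = {
    "Host-1": HostInfo("Host-1", "192.0.2.1", "1.0"),
    "Host-2": HostInfo("Host-2", "192.0.2.2", "1.0"),
}


def mark_not_updated(table, host_name):
    """Set the non-update flag for a host whose path information is stale."""
    table[host_name].flag = NOT_UPDATED


mark_not_updated(host_table, "Host-2")
```

Keying the table by host name is a design choice of this sketch; it makes the flag lookup for a reporting host a single dictionary access.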

(3) Integrated Path Information Table

As shown in FIG. 4, the integrated path information table 25 is a table for integrally managing path information for all hosts 3, for each of which a plurality of paths P is set. Specifically, the integrated path information table 25 includes “PATH” fields 25A, “HOST” fields 25B, “HBAP” fields 25C, “IVOL” fields 25D, “STRG” fields 25E, “CHAP” fields 25F, “EVOL” fields 25G and “STAT” fields 25H.

The “PATH” field 25A, the “HOST” field 25B, the “HBAP” field 25C and the “IVOL” field 25D respectively store the identifier for a path P, the host name for a host 3, the identifier for an HBA port for a host 3 and the identifier for a memory device 36 (internal volume). The “STRG” field 25E, the “CHAP” field 25F and the “EVOL” field 25G respectively store the storage name for a storage apparatus 4, the identifier for a CHA port and the identifier for a logical volume LU (external volume). The “STAT” field 25H stores a path status indicating whether the path P is online or not.

For example, as shown in FIG. 5, when the multi-host management server 2 requests acquisition or update of path information for each host 3, the host 3 that receives the request for the path information responds to the request. Then, the host 3 transmits the path information it manages to the multi-host management server 2. When the multi-host management server 2 receives this path information, the processor 20 in the multi-host management server 2 runs the multi-host management software 22. The processor 20 reads the integrated path information table 25 stored in the memory device 23 and updates the path information of the relevant host 3 in the integrated path information table 25.
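The update step described above can be sketched as follows, assuming the integrated table is held as a list of dictionaries keyed by the field names. The function name `apply_host_refresh`, the reduced set of columns, the path identifiers and the status strings "Online" and "Offline" are assumptions for illustration, and the transport between server and host is omitted.

```python
def apply_host_refresh(integrated_table, host_name, reported_paths):
    """Return a new table in which the rows for host_name are replaced."""
    # Rows that belong to other hosts are kept untouched.
    kept = [row for row in integrated_table if row["HOST"] != host_name]
    # Rows reported by a host carry no HOST column, so it is added here.
    refreshed = [{"HOST": host_name, **row} for row in reported_paths]
    return kept + refreshed


# Hypothetical contents of the integrated path information table 25.
integrated_table = [
    {"PATH": "0001", "HOST": "Host-1", "STAT": "Online"},
    {"PATH": "0002", "HOST": "Host-2", "STAT": "Online"},
]

# Hypothetical answer of Host-2 to the path information request.
reported = [{"PATH": "0002", "STAT": "Offline"}]

integrated_table = apply_host_refresh(integrated_table, "Host-2", reported)
```

Replacing all rows of one host at once, rather than patching single paths, mirrors the idea that a host refresh brings the whole path information of that host up to date.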

(4) Failure Information Table

As shown in FIG. 6, the failure information table 26 is a table for managing the failure information that the multi-host management server 2 receives from each host 3. The failure information table 26 includes “HOST” fields 26A showing the host name of the host 3 connected to the path P in which a failure has occurred, “PATH” fields 26B showing the identifier for the path having the failure, and “DATE” fields 26C showing time information about the time the host 3 received the failure information. The “DATE” fields 26C may alternatively store time information about the time the host 3 detects the failure.

For example, as shown in FIG. 7, when a failure occurs in a certain path PF in a plurality of paths P, the relevant host 3 provides the multi-host management server 2 with failure information about the path PF having the failure. This failure information provided by the relevant host 3 includes the name of the relevant host 3, the identifier for the path PF, and time information about the time the relevant host 3 received the failure information. When the multi-host management server 2 receives the failure information, the processor 20 in the multi-host management server 2 runs the multi-host management software 22. The processor 20 reads the failure information table 26 stored in the memory device 23 and updates the failure information table 26.

In this invention, although the failure information is described as including the time information about the time when the host 3 receives this failure information, the failure information may alternatively include only the host name and the path identifier that are part of the path information.

(5) Path Information Table

Next, the path information table 35 managed by each host 3 will be described in detail.

As shown in FIG. 8, the path information table 35 is a table for managing the path information for each host 3 in order to constantly or periodically monitor the statuses of paths P set to each host 3. The path information table 35 includes “PATH” fields 35A, “HBAP” fields 35B, “IVOL” fields 35C, “STRG” fields 35D, “CHAP” fields 35E, “EVOL” fields 35F and “STAT” fields 35G.

Since the “PATH” fields 35A, the “HBAP” fields 35B, the “IVOL” fields 35C, the “STRG” fields 35D, the “CHAP” fields 35E, the “EVOL” fields 35F and the “STAT” fields 35G are fields corresponding respectively to the above-described “PATH” fields 25A, “HBAP” fields 25C, “IVOL” fields 25D, “STRG” fields 25E, “CHAP” fields 25F, “EVOL” fields 25G and “STAT” fields 25H, their descriptions will be omitted.

(6) Function of Multi-Host Management Software

Next, the functions of the multi-host management software 22 for realizing the multi-host management server 2 of the storage system 1 according to this invention will be described below.

First, FIG. 9 is a function block diagram showing the multi-hostmanagement software 22.

It should be noted that the processor 20 in the multi-host management server 2 executes the multi-host management software 22 based on the function blocks (which will be described later) in the multi-host management software 22.

The multi-host management software 22 includes at least a failure information reception section 220, a screen display section 221, a failure information reception queue 222, a host refresh queue extraction section 223, a host refresh queue 224 and a host refresh execution section 225.

The failure information reception section 220 functions, when receiving from each host 3 the failure information including the host name, the path identifier and the time when the host 3 receives the failure, to store the failure information in the later-described failure information reception queue 222. The failure information reception section 220 also registers the received failure information in the failure information table 26 stored in the memory device 23. Then the failure information reception section 220 refers to the host information table 24 in the memory device 23 and sets a non-update flag “1” in the field of the host information table 24 corresponding to the host 3 that has transmitted that failure information.
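The three actions of the reception section (enqueue, register in the failure information table, set the non-update flag) can be sketched as below. The names `FailureInfo` and `FailureReceiver`, the use of a plain dictionary for the flags and the numeric timestamp are assumptions of this sketch, not details given in the specification.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureInfo:
    """One piece of failure information (attribute names are illustrative)."""
    host: str    # host name
    path: str    # identifier of the path having the failure
    date: float  # time information, here simply seconds


class FailureReceiver:
    """Hypothetical stand-in for the failure information reception section 220."""

    def __init__(self):
        self.reception_queue = deque()  # failure information reception queue 222
        self.failure_table = []         # failure information table 26
        self.host_flags = {}            # "flag" column of host information table 24

    def receive(self, info):
        # 1. Keep the failure information in the reception queue.
        self.reception_queue.append(info)
        # 2. Register it in the failure information table.
        self.failure_table.append(info)
        # 3. Mark the reporting host as not up-to-date (flag "1").
        self.host_flags[info.host] = 1


receiver = FailureReceiver()
receiver.receive(FailureInfo("Host-2", "0002", 100.0))
receiver.receive(FailureInfo("Host-2", "0002", 101.5))
```

Note that no host refresh is triggered here; reception only records the event and flags the host, which is what allows later coalescing.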

The screen display section 221 functions to report the statuses of the hosts 3 to a user. Specifically, the screen display section 221 refers to the host information table 24 after the reception of the failure information and reports to the user the status, i.e., whether or not the host refresh has been executed for the relevant host 3. In other words, the screen display section 221 reports to the user the status indicating whether or not the path information for the relevant host 3 is up-to-date.

As a method for reporting the above status to the user, the processor 20 in the multi-host management server 2 performs display processing such as adding a certain mark on icons showing the hosts 3 on the management screen S1 of the multi-host management server 2. Alternatively, the processor 20 in the multi-host management server 2 may provide a message to the user on the management screen S1 of the multi-host management server 2.

FIG. 10 shows an example of a method for reporting to the user in which the icons displayed in the management screen S1 of the multi-host management server 2 are used.

In FIG. 10, the top icon IC1 and the bottom icon IC3 indicate the hosts 3 in the normal status. Meanwhile, an icon ICN is added to the center icon IC2, the icon ICN indicating that a report of failure information has been received for the relevant host 3 but the host refresh has not been executed (i.e., the path information for the relevant host 3 is not up-to-date). In addition, the message “Not updated” is displayed with this center icon IC2.

As described above, if the non-update flag indicating that the host refresh has not been executed (i.e., the path information is not up-to-date) is set in the host information table 24, icons like the center icon IC2 and ICN in FIG. 10 are displayed. On the other hand, if the non-update flag is not set or an update flag indicating that the host refresh has been executed (i.e., the path information is up-to-date) is set in the host information table 24, an icon like the top icon IC1 or the bottom icon IC3 in FIG. 10 is displayed.
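The mapping from flag to display can be sketched in a few lines. The function name `display_state`, the dictionary layout and the host names are hypothetical; the extra icon ICN and the message “Not updated” are taken from the description of FIG. 10.

```python
def display_state(flag):
    """Map a flag of the host information table 24 to what the screen shows."""
    if flag == 1:
        # Non-update flag set: add the icon ICN and the message.
        return {"extra_icon": "ICN", "message": "Not updated"}
    # Flag not set, or update flag "0": plain host icon only.
    return {"extra_icon": None, "message": None}


# Hypothetical flags for three hosts, as in FIG. 10.
flags = {"Host-1": 0, "Host-2": 1, "Host-3": 0}
screen = {host: display_state(flag) for host, flag in flags.items()}
```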

For example, the hosts 3 are displayed in a tree structure in a host screen S2 on the management screen S1 of the multi-host management server 2 as shown in FIG. 11. If the identifier for the host 3 indicated in the failure information is “2,” the user can check the non-update status with the integrated path information table 25 in this host screen S2. When the user clicks the icon IC2 of the “Host-2,” only path information for this Host-2 in the integrated path information table 25 is output in a path information list screen S3.

As shown in FIG. 12, if the identifier for the host 3 indicated in the failure information is “2,” the user can check the failure information table 26 in the host screen S2 on the management screen S1 of the multi-host management server 2. When the user clicks the icon IC2 of the “Host-2,” the failure information table 26 including this Host-2 is output in a failure information list screen S4.

Referring back to FIG. 9, the failure information reception queue 222 functions to register time information such as the time when the failure information is received from the host 3 and the failure detection time.

The host refresh queue extraction section 223 functions to extract failure information that has been kept for a predetermined time period after the failure reception time and to retrieve the latest failure information from plural pieces of failure information pertaining to a common host. The host refresh queue extraction section 223 also functions to register host-indicating information of this common host in the later-described host refresh queue 224. In addition, the host refresh queue extraction section 223 functions to delete, after the above registration in the host refresh queue 224, all failure information relating to the common host from the failure information reception queue 222. In this invention, although information about a host name (hereinafter referred to as host name information) is utilized as the host-indicating information, other information may be utilized as long as the multi-host management server 2 can identify the host 3.

The host refresh queue 224 functions to store the host name information of each host 3 that is a host-refresh target. The host 3 indicated in the failure information and stored in this host refresh queue 224 is a target for the host refresh processing that will be executed by the later-described host refresh execution section 225.

The host refresh execution section 225 functions to execute the host refresh. When the host refresh execution section 225 acquires the host name information registered in the host refresh queue 224, threads T1 to Tn are run. The number of the running threads T1 to Tn is limited. The number of the threads T1 to Tn is set so that a host refresh that updates many pieces of path information simultaneously in a short time period will not be executed for more than one host 3.

The host refresh execution section 225 executes the host refresh on the relevant host 3. As a result, the integrated path information table 25 in the memory device 23 is brought up to date. For example, suppose that a failure occurs on a certain path PF as shown in FIG. 7. Then the status of the host 3 managing this path PF is set to “offline” in the integrated path information table 25.

The host refresh is preferably executed on each host 3, but may be executed on each path P.

It should be noted that only those functions of the multi-host management software 22 that technically characterize this invention are shown in FIG. 9; functions that the multi-host management software 22 would normally be equipped with are not shown. The functions in this invention are separated only logically, and may physically share a common area.

(6-1) Failure Information Reception Processing

FIGS. 13 and 14 are flowcharts showing failure information reception processing that is performed by the multi-host management server 2 in order to realize the host refresh of this invention. The processor 20 in the multi-host management server 2 performs this failure information reception processing based on the multi-host management software 22.

More specifically, the processor 20 of the multi-host management server 2 stands by until it receives failure information about a path P from each host 3 (S100).

When failure information about a path P is transmitted from a certain host 3, the processor 20 receives this failure information via the failure information reception section 220 (S101).

When the processor 20 registers the received failure information together with the reception time in the failure information reception queue 222 (S102), the failure information reception section 220 sets a non-update flag “1” in the field corresponding to the host 3 indicated in this failure information in the host information table 24 (S103).

The processor 20 operates the host refresh queue extraction section 223 to refer to the reception time for each piece of failure information stored in the failure information reception queue 222 and judges whether or not there is failure information that has been kept for a predetermined time period from the relevant reception time (S104). Here, the predetermined time period indicates the time period that has elapsed since the host refresh queue extraction section 223 received the failure information, and may be about 10 to 20 seconds.

If it is determined that there is failure information that has been kept for the predetermined time period from the reception time (S104: YES), the processor 20 operates the host refresh queue extraction section 223 to check whether or not plural pieces of failure information for the same host 3 as that in the received failure information are stored in the failure information reception queue 222 (S105).

If the processor 20 finds that plural pieces of failure information for the same host 3 are stored in the failure information reception queue 222 (S105: YES), the processor 20 retrieves failure information containing the latest reception time from the plural pieces of failure information for the same host 3 (S106). The processor 20 then registers the host name information for the host having the retrieved failure information in the host refresh queue 224 (S107).

Then the processor 20 deletes all the failure information for the same host 3 from the failure information reception queue 222 (S108) and again stands by until it receives failure information about a path P from the hosts 3 (S100).

On the other hand, if the processor 20 finds that plural pieces of failure information for the same host 3 are not stored in the failure information reception queue 222 (S105: NO), the processor 20 registers the host name contained in the received failure information in the host refresh queue 224 (S109).

Then the processor 20 deletes the failure information for the host 3 from the failure information reception queue 222 (S110) and again stands by until it receives failure information about a path from the hosts 3 (S100).

If the processor 20 determines that there is no failure information that has been kept for a predetermined time period from the reception time in step S104 (S104: NO), the processor again stands by until it receives failure information about a path from the hosts 3 (S100).
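As a minimal sketch of steps S102 to S110, the reception and extraction logic might be modeled as below. The names `FailureInfo`, `receive`, `extract_due`, and `HOLD_SECONDS` are hypothetical, the reception time is simplified to a number of seconds, and the 15-second hold is merely one assumed value within the 10 to 20 second range mentioned above.

```python
from collections import deque
from dataclasses import dataclass

HOLD_SECONDS = 15.0  # assumed value; the text only says "about 10 to 20 seconds"


@dataclass(frozen=True)
class FailureInfo:
    host: str           # host name information
    path: str           # path identifier
    received_at: float  # reception time in seconds (simplified)


def receive(reception_queue, non_update_flags, info):
    """S102 and S103: queue the failure information and set the non-update flag."""
    reception_queue.append(info)
    non_update_flags[info.host] = 1


def extract_due(reception_queue, refresh_queue, now):
    """S104 to S110: register each host whose failure information has been held long enough."""
    due_hosts = []
    for info in reception_queue:  # S104: find entries kept for the hold period
        if now - info.received_at >= HOLD_SECONDS and info.host not in due_hosts:
            due_hosts.append(info.host)
    for host in due_hosts:
        same_host = [i for i in reception_queue if i.host == host]  # S105
        latest = max(same_host, key=lambda i: i.received_at)        # S106
        refresh_queue.append(latest.host)                           # S107 / S109
        # S108 / S110: delete every entry for this host from the reception queue
        reception_queue[:] = [i for i in reception_queue if i.host != host]
    return due_hosts


reception_queue, flags, refresh_queue = [], {}, deque()
receive(reception_queue, flags, FailureInfo("1", "1", 0.0))
receive(reception_queue, flags, FailureInfo("2", "2", 1.0))
receive(reception_queue, flags, FailureInfo("1", "1", 3.0))
extract_due(reception_queue, refresh_queue, now=15.0)
```

In this sketch, the two entries for host “1” result in a single registration in the refresh queue, which is the reduction in host refreshes that the processing aims at.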

Now, an example of the above-described failure information reception processing will be described more specifically referring to FIG. 15. In FIG. 15, the failure information reception queue 222 stores four pieces of failure information: failure information having the host name “1,” the path identifier “1” and the reception time “May 1, 10:15:20” (which will be referred to as “failure information F1”); failure information having the host name “2,” the path identifier “2” and the reception time “May 1, 10:15:21” (which will be referred to as “failure information F2”); failure information having the host name “1,” the path identifier “1” and the reception time “May 1, 10:15:23” (which will be referred to as “failure information F3”); and failure information having the host name “3,” the path identifier “3” and the reception time “May 2, 5:10:20” (which will be referred to as “failure information F4”).

The processor 20 constantly or periodically monitors the failure information reception queue 222 with the host refresh queue extraction section 223. When a predetermined time period has elapsed from the reception time of the failure information F1, the processor 20 extracts the failure information F1 from the failure information reception queue 222.

Then, the processor 20 checks whether or not failure information with the same host name as host name “1” in the failure information F1 is stored in the failure information reception queue 222 (i.e., checks whether or not other failure information with the host name “1” is stored in the failure information reception queue 222). Subsequently, the processor 20 determines that the host name in the failure information F3 is the same, i.e., “1.”

The processor 20 extracts the reception time in the failure information F3, compares the reception time (May 1, 10:15:20) contained in the failure information F1 with the reception time (May 1, 10:15:23) contained in the failure information F3, and selects the latest failure information.

Here, the processor 20 determines that the failure information F3 is the latest failure information and registers the host name information H3 contained in the failure information F3 (which is “Host=1” in this example) in the host refresh queue 224.

It should be noted that when the host refresh queue extraction section 223 registers the host name information H1 to H3 in the host refresh queue 224, it may register the failure information F1 to F4 itself or may register only the host name information H1 to H3 included in the failure information. In short, information that shows which host 3 is a host refresh target can be registered in the host refresh queue 224.

Once the processor 20 registers the host name information H3 in the host refresh queue 224 based on the latest failure information F3, the failure information F1 and the failure information F3 are no longer necessary, so the processor 20 deletes the failure information F1 and the failure information F3 from the failure information reception queue 222.

As a result of the above processing, which the processor 20 performs by operating the host refresh queue extraction section 223, the failure information F2 and the failure information F4 are left in the failure information reception queue 222, and the processor 20 continues to monitor the failure information reception queue 222.
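The FIG. 15 example can be replayed with a short sketch. The names `FailureInfo` and `coalesce` are hypothetical, and reception times are written as (month, day, hour, minute, second) tuples so that no year needs to be assumed. The sketch models only the handling triggered by the failure information F1, not the timer that decides when F1 becomes due.

```python
from typing import NamedTuple


class FailureInfo(NamedTuple):
    label: str
    host: str
    path: str
    received: tuple  # (month, day, hour, minute, second)


# Contents of the failure information reception queue 222 in the FIG. 15 example.
reception_queue = [
    FailureInfo("F1", "1", "1", (5, 1, 10, 15, 20)),
    FailureInfo("F2", "2", "2", (5, 1, 10, 15, 21)),
    FailureInfo("F3", "1", "1", (5, 1, 10, 15, 23)),
    FailureInfo("F4", "3", "3", (5, 2, 5, 10, 20)),
]


def coalesce(queue, trigger):
    """Pick the latest entry for the trigger's host and drop all entries for that host."""
    same_host = [f for f in queue if f.host == trigger.host]
    latest = max(same_host, key=lambda f: f.received)
    remaining = [f for f in queue if f.host != trigger.host]
    return latest, remaining


latest, remaining = coalesce(reception_queue, reception_queue[0])  # triggered by F1
host_refresh_queue = [latest.host]

print(latest.label)                  # F3
print([f.label for f in remaining])  # ['F2', 'F4']
print(host_refresh_queue)            # ['1']
```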

The processor 20 can recognize which host requires the update of path information based on the host name information H1 to H4 stored in the host refresh queue 224.

(6-2) Path Information Update Processing

Next, the path information update processing in which the processor 20 runs the threads T1 to Tn to update path information will be described below. The processor 20 performs this path information update processing based on the multi-host management software 22.

Specifically, as shown in FIG. 16, the processor 20 either constantly or periodically monitors the host refresh queue 224 (S200). When host name information H is registered in the host refresh queue 224, the processor 20 sequentially acquires the host name information for the threads T1 to Tn (S201).

The processor 20 executes the host refresh on the host 3 having this host name information (S202). Specifically, the processor 20 requests the host 3 indicated in the acquired host name information to acquire its path information. The host 3 that receives this acquisition request acquires the relevant path information stored in the memory device 36 and transfers it to the multi-host management server 2. After receiving this path information, the multi-host management server 2 updates the integrated path information table 25 stored in the memory device 23. Consequently, the path information for the host 3 indicated in the host name information that was acquired for the threads T1 to Tn is brought up to date.

After the execution of the host refresh, the processor 20 deletes the relevant host name information from the host refresh queue 224 (S203). The processor 20 then deletes the non-update flag indicating that the host refresh has not been executed from the host information table 24 and sets the update flag “0” (S204).

After updating the host information table 24, the process returns to step S200 and the processor 20 again monitors the host refresh queue 224.
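One pass through steps S200 to S204 could be sketched as follows. The function `refresh_next` and the callable `request_path_info`, which stands in for the acquisition request sent to the host 3, are hypothetical, as is the dictionary layout used for the integrated path information table 25.

```python
from collections import deque


def refresh_next(refresh_queue, request_path_info, integrated_paths, non_update_flags):
    """Run one host refresh; return the refreshed host name, or None if the queue is empty."""
    if not refresh_queue:                             # S200: nothing registered yet
        return None
    host = refresh_queue[0]                           # S201: acquire host name information
    integrated_paths[host] = request_path_info(host)  # S202: fetch and store path information
    refresh_queue.popleft()                           # S203: delete the host name information
    non_update_flags[host] = 0                        # S204: clear non-update flag, set update flag
    return host


# Hypothetical stand-in for the host-side response to the acquisition request.
def fake_request(host):
    return {"path-2": "offline"}


refresh_queue = deque(["Host-2"])
integrated_paths = {}
flags = {"Host-2": 1}
refreshed = refresh_next(refresh_queue, fake_request, integrated_paths, flags)
```

The host name is removed from the queue only after the request succeeds, so in this sketch a failed request leaves the host registered for a later attempt; the description does not state how such a case is handled.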

Although the number of threads T1 to Tn provided in the host refresh execution section 225 is preferably more than one, as shown in FIG. 9, the number of threads may also be one. In the arrangement where more than one thread is provided, the threads T1 to Tn perform the processing in parallel. In addition, the number of threads T1 to Tn may be arbitrarily set, and setting a limit to the number of threads can prevent many host refreshes from being executed simultaneously in a short time period.
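The effect of limiting the number of threads can be illustrated with a bounded thread pool from the Python standard library. This is a sketch under assumptions: `run_refreshes`, `fake_refresh`, and the limit of two threads are hypothetical, and the sleep merely simulates the time taken by a host refresh.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_THREADS = 2  # assumed limit; the text leaves the number of threads arbitrary


def run_refreshes(hosts, refresh_host, max_threads=MAX_THREADS):
    """Refresh every host while running at most max_threads refreshes at once."""
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        return dict(zip(hosts, pool.map(refresh_host, hosts)))


# Instrumentation to observe how many refreshes overlap.
lock = threading.Lock()
stats = {"active": 0, "peak": 0}


def fake_refresh(host):
    with lock:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
    time.sleep(0.01)  # simulated host refresh
    with lock:
        stats["active"] -= 1
    return "refreshed"


hosts = ["Host-1", "Host-2", "Host-3", "Host-4", "Host-5"]
results = run_refreshes(hosts, fake_refresh)
```

With five queued hosts and a limit of two, the recorded peak never exceeds two, while the remaining hosts wait for a free thread.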

(7) Effects and Advantages of Embodiment

The multi-host management server in this embodiment temporarily stores failure information in the queue and executes the host refresh after a predetermined time period elapses. With the arrangement, the multi-host management server extracts (plural pieces of) failure information regarding a common host which has (have) been kept for a predetermined time period and retrieves the latest failure information from the extracted failure information. Then the multi-host management server executes the host refresh based on the latest failure information for the common host and deletes all the failure information including this latest failure information from the queue. Accordingly, the storage system of this embodiment can reduce the number of unnecessary host refreshes. In particular, in the situation where a plurality of failures occurs in a short time period like the instantaneous path interruption, the load on the network can be greatly reduced.

Also, the multi-host management server in this embodiment does not execute the host refresh soon after the reception of the failure information, but instead executes the host refresh after keeping the failure information for a predetermined time period. With this arrangement, a time lag is generated between the reception of the failure information and the execution of the host refresh. Accordingly, in this embodiment, the multi-host management server is arranged so that, when the host refresh has not been executed even after the reception of the failure information, a user can recognize that situation. This arrangement can avoid problems in which a user performs an operation using a path having a failure (e.g., disconnection) without being aware of the failure and causes trouble to occur in the host's processing.

While the invention has been described with respect to a limited number of embodiments, those skilled in the art, having the benefit of this disclosure, will appreciate that other embodiments can be devised that do not depart from the scope of the invention as disclosed herein. Accordingly, the scope of the invention should be limited only by the attached claims.

1. A management computer configured to manage a storage apparatus, and host computers which are coupled to the storage apparatus with multiple paths, and the host computer storing a plurality of path information, comprising: a memory storing a plurality of failure reception information including failure information and refresh information; and a processor executing to: (A) receive a plurality of failure information from two or more of the host computers for a predetermined period; (B) store the plurality of failure information of (A) to the failure reception information in the memory; (C) extract one or more of the plurality of failure information, which is received from a first host computer among the two or more of the host computers; (D) retrieve the failure information about one path from the extracted one or more of the plurality of failure information, about multiple paths; (E) register the first host computer via refresh information stored in the memory, where the refresh information indicates a host computer of which path information is to be updated; (F) send a request to the first host computer to acquire a status of a first path of the first host computer; (G) update a first path information in the plurality of path information of the first host computer, based on the status of the first path; and (H) delete the one or more of the plurality of failure information extracted in (C), from the failure reception information.
2. The management computer according to claim 1, wherein the processor extracts plural different incidences of failure information about the multiple paths received from a common host computer, after a predetermined time period, from reception times of the plural different incidences of failure information.
3. The management computer according to claim 2, wherein the management computer manages the host computers using a host information table that stores a flag indicating whether or not path information is up-to-date for each host computer; and the management computer further includes a screen display section that refers to the flag stored in the host information table and displays on a screen of the management computer a display showing, after failure information about a path was received from a certain host computer, whether or not path information for the relevant host computer is up-to-date.
4. The management computer according to claim 3, wherein the management computer sets a non-update flag indicating that the path information is not up-to-date in the host information table, after the failure information about the path is received from the host computer.
5. The management computer according to claim 1, wherein the processor executes the update of the path information on a limited number of threads.
6. The management computer according to claim 1, wherein the processor retrieves failure information about a path whose reception time which is the time when the failure information is received by the host computer, is the latest from extracted plural different incidents of failure information about the multiple paths.
7. A management method performed on a management computer to manage a storage apparatus, and host computers which are coupled to the storage apparatus with multiple paths, and the host computer storing a plurality of path information, wherein the management computer comprises a processor and a memory storing a plurality of failure reception information including failure information and refresh information; wherein the management method is implemented by the processor effecting operations comprising: (A) receiving a plurality of failure information from two or more of the host computers for a predetermined period; (B) storing the plurality of failure information received in operation (A) to the failure reception information in the memory; (C) extracting one or more of the plurality of failure information, which is received from a first host computer among the two or more of the host computers; (D) retrieving the failure information about one path from the extracted one or more of the plurality of failure information, about the multiple paths; (E) registering the first host computer via refresh information stored in the memory, where the refresh information indicates a host computer of which path information is to be updated; (F) sending a request to the first host computer to acquire a status of a first path of the first host computer; (G) updating a first path information in the plurality of path information of the first host computer, based on the status of the first path; and (H) deleting the one or more of the plurality of failure information extracted in operation (C), from the failure reception information.
8. The management method according to claim 7, further comprising: extracting plural different incidences of failure information about the multiple paths received from a common host computer, after a predetermined time period from reception times of the plural different incidences of failure information.
9. The management method according to claim 8, further comprising: managing the host computers using a host information table that stores a flag indicating whether or not path information is up-to-date for each host computer; and referring to the flag stored in the host information table, and displaying on a screen a display showing, after failure information about a path was received from a certain host computer, whether or not path information for the relevant host computer is up-to-date.
10. The management method according to claim 9, further comprising: setting a non-update flag indicating that the path information is not up-to-date in the host information table, after the failure information about the path is received from the host computer.
11. The management method according to claim 7, further comprising: executing the update of the path information on a limited number of threads.
12. The management computer according to claim 7, further comprising: retrieving failure information about a path whose reception time which is the time when the failure information is received by the host computer, which is the latest from the extracted plural different incidents of failure information about the multiple paths.